A Word-Embedding-based Sense Index for Regular Polysemy Representation

نویسندگان

  • Marco Del Tredici
  • Núria Bel
چکیده

We present a method for the detection and representation of polysemous nouns, a phenomenon that has received little attention in NLP. The method is based on the exploitation of the semantic information preserved in Word Embeddings. We first prove that polysemous nouns instantiating a particular sense alternation form a separate class when clustering nouns in a lexicon. Such a class, however, does not include those polysemes in which a sense is strongly predominant. We address this problem and present a sense index that, for a given pair of lexical classes, defines the degree of membership of a noun to each class: polysemy is hence implicitly represented as an intermediate value on the continuum between two classes. We finally show that by exploiting the information provided by the sense index it is possible to accurately detect polysemous nouns in the dataset.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Regular polysemy: from sense vectors to sense patterns

Regular polysemy was extensively investigated in lexical semantics, but this phenomenon has been very little studied in distributional semantics. We propose a model for regular polysemy detection that is based on sense vectors and allows to work directly with senses in semantic vector space. Our method is able to detect polysemous words that have the same regular sense alternation as in a given...

متن کامل

Joint Learning of Sense and Word Embeddings

Methods for learning lower-dimensional representations (embeddings) of words using unlabelled data have received a renewed interested due to their myriad success in various Natural Language Processing (NLP) tasks. However, despite their success, a common deficiency associated with most word embedding learning methods is that they learn a single representation for a word, ignoring the different ...

متن کامل

Syntactico Semantic Word Representations in Multiple Languages

Our project is an extension of the project “Syntactico Semantic Word Representations in Multiple Languages”[1]. The previous project aims to improve the semantical representation of English vocabulary via incorporating the local context with global context and supplying homonymy and polysemy for multiple embeddings per word. It also introduces a new neural network architecture that learns the w...

متن کامل

Rules, Radical Pragmatics and Restrictions on Regular Polysemy

Although regular polysemy [e.g. producer for product (John read Dickens) or container for contents (John drank the bottle)] has been extensively studied, there has been little work on why certain polysemy patterns are more acceptable than others. We take an empirical approach to the question, in particular evaluating an account based on rules against a gradient account of polysemy that is based...

متن کامل

A New Document Embedding Method for News Classification

Abstract- Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. There is lots of news spreading on the web. A text classifier can categorize news automatically and this facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015